Trash article detection using categorization techniques
Abstract
We explore techniques for detecting news articles that contain invalid information, with the help of text categorization technology. The volume of information on the World Wide Web is large enough to distract users who are trying to find useful information. Many text categorization methodologies have been presented to cope with this volume of data. One major problem is that many articles fetched by a crawler, stored in a back-end database, and finally given as input to a categorization subsystem may not contain valid information for the user (trashy articles). This may lead users to lose trust in the system. In this paper, we analyze the special properties of the categorization of trashy news articles that allow us to detect them, and we propose a specific methodology for trash detection. Finally, we evaluate the proposed algorithm on a news categorization system and show the overall benefit of a trash detection mechanism for the system.
Similar resources
Rice Classification and Quality Detection Based on Sparse Coding Technique
Classification of various rice types and determination of its quality is a major issue in the scientific and commercial fields associated with modern agriculture. In recent years, various image processing techniques are used to identify different types of agricultural products. There are also various color and texture-based features in order to achieve the desired results in this area. In this ...
Trash detection system for a citrus canopy shake and catch harvester using machine vision
Automatic estimation of the amount of trash, such as branches and leaves, collected by a mechanical citrus harvester during harvesting eliminates problems in the processing plants with handling diseased leaves and fruit. A machine vision system was developed to estimate the amount of trash collected by a citrus canopy shake and catch harvester by acquiring and analyzing the images of the harves...
Robotic Detection of Marine Litter Using Deep Visual Detection Models
Trash deposits in aquatic environments have a destructive effect on marine ecosystems and pose a long-term economic and environmental threat. Autonomous underwater vehicles (AUVs) could very well contribute to the solution of this problem by finding and eventually removing trash. A step towards this goal is the successful detection of trash in underwater environments. This paper evaluates a num...
High Speed Trash Measurements
This paper discusses the identification of trash objects in cotton using machine vision-based systems. Soft computing techniques such as neural networks and fuzzy inference systems can classify trash objects into individual categories such as bark, stick, leaf, and pepper trash types with great accuracy. High speed trash measurements enable the implementation of these techniques for on-line...
Detection and Elimination of Trash using Machine Vision and Extended De-Stemmer for a Citrus Canopy Shake and Catch Harvester
The main objective of this research was to design an efficient trash removal system and quantify the amount of trash materials such as leaves and twigs, generated during harvesting by a continuous citrus canopy shake and catch harvester, and to compare the efficiency of two destemmers with different lengths. A regular de-stemmer with a set of ten 24-inch long rollers and an extended de-stemmer ...
Publication date: 2009